Tiling on systems with communication/computation overlap
نویسندگان
چکیده
In the framework of fully permutable loops, tiling is a compiler technique (also known as ‘loop blocking’) that has been extensively studied as a source-to-source program transformation. Little work has been devoted to the mapping and scheduling of the tiles on to physical parallel processors. We present several new results in the context of limited computational resources and assuming communication–computation overlap. In particular, under some reasonable assumptions, we derive the optimal mapping and scheduling of tiles to physical processors. Copyright 1999 John Wiley & Sons, Ltd.
منابع مشابه
Tiling with limited resources
In the framework of perfect loop nests with uniform dependences, tiling has been extensively studied as a source-to-source program transformation. Little work has been devoted to the mapping and scheduling of the tiles on to physical processors. We present several new results in the context of limited computational resources, and assuming communication-computation overlap. In particular, under ...
متن کاملDelivering High Performance to Parallel Applications Using Advanced Scheduling
This paper presents a complete framework for the parallelization of nested loops by applying tiling transformation and automatically generating MPI code allowing for an advanced scheduling scheme. In particular, under advanced scheduling scheme we consider two separate techniques: first, the application of a suitable tiling transformation, and second the overlapping of computation and communica...
متن کاملCommunication Scheduling as a First-Class Citizen in Distributed Machine Learning Systems
State-of-the-art machine learning systems rely on graphbased models, with the distributed training of these models being the norm in AI-powered production pipelines. The performance of these communication-heavy systems depends on the effective overlap of communication and computation. While the overlap challenge has been addressed in systems with simpler model representations, it remains an ope...
متن کاملOptimizing Metacomputing with Communication-Computation Overlap
In the framework of distributed object systems, this paper presents the concepts and an implementation of an overlapping mechanism between communication and computation. This mechanism allows to decrease the execution time of a remote method invocation with parameters of large size. Its implementation and related experiments in the C++// language running on top of Globus and Nexus are described.
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Concurrency - Practice and Experience
دوره 11 شماره
صفحات -
تاریخ انتشار 1999